November 23, 2020

Recap

Just days before the election, my final forecast went against the wisdom of professional forecasters and pollsters alike and projected a rail-thin electoral margin for Joe Biden. While the election results surprised many people on the night of November 3, my model’s point prediction anticipated an even closer race in the Electoral College (273 electoral votes for Biden compared to his actual 306) but a wider spread in the popular vote (52.8% compared to his actual 51.9%).

Accuracy and Patterns

The statistical aphorism that “all models are wrong, but some are useful” served as my guiding philosophy in constructing this model. As I discussed in my final prediction, I did not expect this model to perfectly forecast all outcomes in the election. Rather, this forecast aimed to provide a range of state-level probabilities and outcomes. Then, I used the most probable state-level outcomes to produce point predictions for the Electoral College and national popular vote. While these numbers could be interpreted as my “final prediction,” I would have been incredibly shocked if the exact outcomes matched perfectly, given the wide variability in my simulations.

The actual Electoral College outcome, with each candidate winning the states that they won on Election Night, occurred in 53 of my 100,000 simulations (0.053%). To put that into perspective, my point prediction occurred in 5,080 of my simulations, which equates to 5.08%. Forecasters cannot predict the election outcome with absolute certainty, but models provide a range of possible scenarios. This model successfully anticipated a close Electoral College race with a large popular vote margin, and the actual outcome occurred more than a handful of times in my simulations.
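
To make that counting concrete, the sketch below shows how simulation output can be tallied into an electoral-vote distribution and a relative frequency for any particular map. This is a minimal illustration in Python, not my actual pipeline: the win indicators are random stand-ins, and the state subset and variable names (`sims`, `ev`) are hypothetical.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)

# Hypothetical simulation output: one row per simulated election and one
# 0/1 column per state (1 = Biden wins that state). Random stand-ins here.
states = ["AZ", "GA", "NV", "PA", "WI"]          # illustrative subset
ev = {"AZ": 11, "GA": 16, "NV": 6, "PA": 20, "WI": 10}
sims = pd.DataFrame(rng.binomial(1, 0.5, size=(100_000, len(states))),
                    columns=states)

# Distribution of Biden's electoral votes across simulations.
biden_ev = sims.mul(pd.Series(ev)).sum(axis=1)
modal_ev = biden_ev.value_counts().idxmax()      # analogue of a point prediction

# Relative frequency of one particular map (here, Biden sweeping all five).
actual = pd.Series(1, index=states)
share = (sims == actual).all(axis=1).mean()
print(f"Modal EV total: {modal_ev}; exact map occurred in {share:.3%} of simulations")
```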

All in all, I’m quite happy with how closely this model paralleled the election outcomes. It only misclassified the winner of GA, NV, and AZ, which were among the final states to be called. Even though the model predicted that a Donald Trump victory was more likely in these states, the forecast anticipated a close race and gave either candidate a fair shot at winning: Joe Biden won GA, NV, and AZ in 19.2%, 43.9%, and 20.5% of simulations, respectively. In my nationwide election simulations, the exact election outcome occurred in 53 out of 100,000 (0.053%) simulations. Interpreting these probabilities with a frequentist1 approach, those probabilities could very well have been correct, and we just happened to observe one of the 53 elections where each candidate won this exact cocktail of states.

With a correlation of 0.9608654 between the actual and predicted state-level two-party vote shares, the forecast tracked the results incredibly closely. With that said, there are a few patterns in the inaccuracies:

The maps below illustrate the areas with the greatest error. Notice that safe blue and red states such as New York and Louisiana have relatively large errors, while battleground states such as Texas and Ohio have extremely slim errors. For a closer look at the data, the table that follows contains the actual and predicted two-party vote shares for Joe Biden in every state, ordered by the magnitude of the error:

Predicted vs. Actual Results

Error Map

State   Actual (%)   Predicted (%)    Error (pp)
NY        57.24493        69.61151   -12.3665780
RI        60.50962        69.52752    -9.0179003
HI        65.03633        72.32903    -7.2927031
LA        40.53556        33.93166     6.6039010
SC        44.07339        37.81854     6.2548501
DE        59.62674        65.87647    -6.2497293
AR        35.78478        29.62813     6.1566468
AK        44.71671        40.02248     4.6942271
NJ        58.00345        62.50579    -4.5023339
CA        65.06225        69.50769    -4.4454369
CT        60.17662        64.60562    -4.4289997
WA        59.95101        64.18870    -4.2376851
ND        32.78259        36.80377    -4.0211801
OR        58.33341        62.25845    -3.9250389
MA        66.86435        70.77922    -3.9148649
NE        40.24716        44.02699    -3.7798311
WV        30.20202        33.80620    -3.6041767
KS        42.25143        38.65813     3.5932950
MN        53.63371        50.05711     3.5765990
MS        40.59077        37.20396     3.3868054
GA        50.14253        47.01611     3.1264220
ME        55.14282        52.09217     3.0506449
SD        36.56522        39.42158    -2.8563608
AZ        50.15683        47.34474     2.8120875
MT        41.60337        38.79975     2.8036197
MO        42.06354        39.50587     2.5576671
AL        37.03289        34.62090     2.4119937
KY        36.79758        34.47128     2.3263011
IN        41.79335        39.56968     2.2236715
VA        55.15706        57.35345    -2.1963895
NV        51.22312        49.36777     1.8553463
TN        38.11647        36.32844     1.7880338
NM        55.51576        53.81917     1.6965921
CO        56.93974        58.48604    -1.5463005
UT        39.30687        37.82175     1.4851174
NC        49.31589        48.10486     1.2110359
IL        58.45714        59.62805    -1.1709124
IA        45.81652        46.91440    -1.0978824
VT        68.29919        67.26910     1.0300846
MI        51.36015        50.53204     0.8281138
NH        53.74888        53.06014     0.6887330
WY        27.51957        26.85758     0.6619846
FL        48.30525        48.95294    -0.6476975
MD        66.62085        67.16643    -0.5455827
OK        33.05996        32.60532     0.4546372
TX        47.12886        46.69227     0.4365935
ID        34.12328        33.75413     0.3691501
OH        45.85491        45.99202    -0.1371140
PA        50.60314        50.68815    -0.0850104
WI        50.31728        50.35326    -0.0359886

Since this model’s errors were not uniformly biased in one direction, as was the case for most other forecast models, its average error is considerably closer to zero than that of other popular forecasts, and its errors are more normally distributed around zero:

Comparison Summary Statistics

Model             Mean Error (pp)   RMSE (pp)   Classification Accuracy (%)   Missed States
Kayla Manning          -0.2804308    3.874984                            94   AZ, GA, NV
The Economist          -2.3310087    2.803927                            96   FL, NC
FiveThirtyEight        -2.4447961    3.019431                            96   FL, NC
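
For reference, the summary statistics in the table above can all be computed directly from the state-level table. The snippet below is a minimal sketch assuming a hypothetical `results` frame with `actual` and `predicted` columns holding Biden’s two-party vote share; only three illustrative rows are shown.

```python
import numpy as np
import pandas as pd

# Hypothetical frame mirroring the state table above; `actual` and
# `predicted` hold Biden's two-party vote share (%). Three rows shown.
results = pd.DataFrame({
    "state": ["NY", "GA", "WI"],
    "actual": [57.24493, 50.14253, 50.31728],
    "predicted": [69.61151, 47.01611, 50.35326],
})

error = results["actual"] - results["predicted"]
mean_error = error.mean()                          # signed bias
rmse = np.sqrt((error ** 2).mean())                # typical error magnitude
correlation = results["actual"].corr(results["predicted"])

# A state is classified correctly when both shares fall on the same
# side of 50% of the two-party vote.
correct = (results["actual"] > 50) == (results["predicted"] > 50)
accuracy = 100 * correct.mean()
```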

Error Histograms

Hypothesis for Inaccuracies

As with any forecast model that incorporated polls, this forecast would have benefited from improved polling accuracy. Unfortunately, I do not control the polling methodology, so I must improve my model in other ways. I applied an aggressive weighting scheme based on FiveThirtyEight’s pollster grades in an attempt to control for polling bias. In spite of these efforts, the model still produced predictions that were too extreme in both directions: more favorable for Biden in liberal states and more favorable for Trump in conservative states. The diverging direction of the inaccuracies leads me to consider other potential causes and improvements for future iterations of this model.
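
As one concrete illustration of what such a weighting scheme can look like, the sketch below down-weights polls by pollster grade before averaging. The grade-to-weight mapping, column names, and poll values are all hypothetical; this is not the exact scheme I used.

```python
import pandas as pd

# Hypothetical poll table; `grade` is the FiveThirtyEight pollster grade.
polls = pd.DataFrame({
    "pollster": ["Pollster A", "Pollster B", "Pollster C"],
    "grade": ["A+", "B", "C-"],
    "biden_pct": [51.0, 53.5, 49.0],
})

# One possible aggressive scheme: weights fall off steeply with grade.
# This mapping is illustrative, not the one used in my model.
grade_weight = {"A+": 3.0, "A": 2.5, "B": 1.0, "C-": 0.25}
polls["weight"] = polls["grade"].map(grade_weight)

# Grade-weighted polling average.
weighted_avg = (polls["biden_pct"] * polls["weight"]).sum() / polls["weight"].sum()
print(f"Weighted Biden average: {weighted_avg:.1f}%")
```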

This model failed to capture the magnitude of changing views in states such as Arizona and Georgia, both of which voted for Trump in 2016 yet voted for Biden in 2020.2 To account for this in 2024 and beyond, I could include a variable that captures shifting partisanship within a state between elections. In this model, I attempted to use demographic changes as a proxy for this, but a more direct variable might work better. I plan on incorporating a “difference in Democratic vote share” variable in future iterations of this model, which looks at the change in a state’s share of the two-party popular vote between the two previous elections. For 2024, I would subtract each state’s Democratic vote share in 2016 from its Democratic vote share in 2020. Negative numbers would indicate Republican shifts and positive numbers would indicate Democratic shifts, with larger absolute values indicating a shift of greater magnitude.
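
Computing this variable would be straightforward once the historical vote shares are assembled. The sketch below uses illustrative numbers and hypothetical column names:

```python
import pandas as pd

# Illustrative numbers; `dem_2016` and `dem_2020` are each state's
# Democratic share of the two-party popular vote (%).
shares = pd.DataFrame({
    "state": ["AZ", "GA", "OH"],
    "dem_2016": [48.1, 47.3, 45.7],
    "dem_2020": [50.2, 50.1, 45.9],
})

# Proposed 2024 predictor: the shift between the two prior elections.
# Positive values indicate a Democratic trend, negative a Republican
# trend; larger absolute values indicate a larger shift.
shares["dem_shift"] = shares["dem_2020"] - shares["dem_2016"]
```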

Proposed Test to Assess Hypothesis

To assess this hypothesis, I could reconstruct the model, following the same procedures outlined in my final prediction. I would use the same data from 1992-2016 and add the new variable that captures state-level changes in voting patterns between elections. Once I have constructed this new model, I could assess its validity in several ways:

  • First and foremost, I would assess the statistical significance of the model coefficient on the partisan-change variable.
  • Then, I could assess the out-of-sample fit with leave-one-out cross-validation and compare the classification accuracy with that of my previous model (see the sketch after this list).
  • If both of those steps support the strength of this new model, I could forecast the 2020 results using this year’s data. To remain consistent between the two models, I would not use polls from after 3 PM EST on November 1, which is the last time I ran the previous forecast model.
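
The leave-one-out step in particular is easy to sketch. The snippet below uses synthetic stand-in data and an ordinary linear regression (via scikit-learn) purely to illustrate the procedure; my actual model’s specification and features would replace them.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(1)

# Synthetic stand-in for the 1992-2016 state-year data: the columns of X
# play the role of the existing predictors plus the proposed partisan-shift
# variable; y plays the role of the Democratic two-party vote share (%).
n, k = 350, 4
X = rng.normal(size=(n, k))
y = 50 + X @ np.array([3.0, 1.5, 0.8, 2.0]) + rng.normal(scale=2.0, size=n)

# Leave-one-out cross-validation: refit the model with each observation
# held out and check whether the held-out winner is classified correctly.
correct = 0
for i in range(n):
    mask = np.ones(n, dtype=bool)
    mask[i] = False
    fit = LinearRegression().fit(X[mask], y[mask])
    pred = fit.predict(X[i:i + 1])[0]
    correct += (pred > 50) == (y[i] > 50)

loo_accuracy = 100 * correct / n
print(f"LOO classification accuracy: {loo_accuracy:.1f}%")
```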

Finally, I would compare this model’s 2020 forecast to my previous model’s. If the new model more accurately predicted the state-level outcomes, then I would know that my 2024 forecast should resemble it. However, if my previous model performed better in the leave-one-out classification and on the 2020 data, then I would stick with my original, more parsimonious model for the future.

Improvements for Future Iterations

Aside from the lack of a variable to capture shifting partisan alignment within states, there are several other modifications I would make to the methodology behind this model in a future iteration. I touched on many of these in greater detail in my final prediction post, but here is a brief overview:

  • This model does not include Washington, D.C. in the forecast; instead, I manually added its 3 electoral votes after forecasting the vote shares for the 50 states. Ideally, I would find the necessary data to include D.C. in the forecast itself.
  • Due to the absence of appropriate district-level data, this model allocates the electoral votes from Maine and Nebraska on a winner-take-all basis rather than following the congressional district method that those states actually use. Again, future iterations would ideally include district-level data for these states.
  • This model varied voter turnout and partisan probabilities independently by simply drawing from a normal distribution. A more sophisticated model would introduce some correlation between geographies, demographic groups, and ideologies. Moreover, since I drew these probabilities from a normal distribution, some draws could be negative if the initial probability of voting for a particular party was extremely low (e.g. voting Republican in Hawaii). I mitigated this by taking the absolute value of the probability, but this introduced some extreme variation into the model, and I must find a better way to restrict the domain in future iterations (see the sketch after this list).
  • Lastly, I classified states based on their 2020 ideologies. Ideally, I would re-classify each state in every election year when constructing the model. For example, this model treated Colorado as a “blue state” for all years based on its 2020 classification, even though it was a “red state” or a “battleground state” in most of the previous elections in the data. In the future, I would like to set a rule for classifying each state in every election, rather than relying solely on the 2020 classification by the New York Times.
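
On the domain-restriction point, one better-behaved alternative to taking absolute values is to perturb probabilities on the log-odds scale, which keeps every draw strictly inside (0, 1). The sketch below contrasts the two approaches with an illustrative baseline probability; the delta-method scaling of the standard deviation is my assumption, not something from the original model.

```python
import numpy as np
from scipy.special import expit, logit

rng = np.random.default_rng(0)

# Illustrative baseline: the probability of a given Hawaii voter
# voting Republican, with the spread used in the simulations.
p, sd = 0.03, 0.02

# Current approach: normal draws can go negative, and taking the
# absolute value folds the tail back and distorts the distribution.
naive = np.abs(rng.normal(p, sd, 100_000))

# Alternative: perturb on the log-odds scale so every draw stays
# strictly inside (0, 1). The delta-method scaling sd / (p * (1 - p))
# approximately preserves the original spread (my assumption).
logit_sd = sd / (p * (1 - p))
bounded = expit(rng.normal(logit(p), logit_sd, 100_000))
```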

Conclusion

While my forecast did not perfectly predict the election outcomes, this model correctly projected a relatively close race in the Electoral College with a larger margin in the popular vote. Furthermore, the outcomes of November 3 all reasonably match the probabilities assigned by the model. Even in GA, NV, and AZ (the three misclassified states), the actual vote shares were not far from the predictions, and both candidates had a fair probability of winning those states. Despite having predicted this election exceptionally well when many models did not, future iterations of this model must do a better job of accounting for partisan shifts within states (assuming this country survives to see another 4 years).



  1. Unlike rolling dice, we cannot experience multiple occurrences of the same election to uncover the true probability of each event. Frequentist probability describes the relative frequency of an event in many trials; conducting many simulations in my model took a frequentist approach to uncover the probability of each outcome. However, we can never really know if any of the probabilities were correct because the 2020 election only happened once (thank goodness!). Trying to say whether or not a probabilistic forecast was correct is like rolling a “six” on a single die and concluding that your prior probabilities of 1/6 for rolling a 6 and 5/6 for rolling anything else were incorrect because you observed the less probable outcome on a single iteration.↩︎

  2. However, any changes would have to account for the fact that FL, OH, WI, etc. were more conservative than most forecasts anticipated, and this model correctly anticipated the winner in these highly contentious battleground states.↩︎